ON THE POLYAK CONVEXITY PRINCIPLE AND ITS APPLICATION TO VARIATIONAL ANALYSIS



A. UDERZO 

Abstract. According to a result due to B.T. Polyak, a mapping between Hilbert spaces, which is $C^{1,1}$ around
a regular point, carries a ball centered at that point to a convex set, provided that the radius of the ball is small
enough. The present paper considers the extension of this result to mappings defined on a certain subclass
of uniformly convex Banach spaces. This enables one to extend to such a setting a variational principle for
constrained optimization problems, already observed in finite dimension, that establishes a convex behaviour
for proper localizations of them. Further variational consequences are explored.



1. Introduction 

The source of several deep results and intriguing problems in nonlinear analysis can be found, to an attentive 
view, in the fruitful interplay between smoothness and convexity. Sometimes, there is some smoothness
hidden in convexity. The generic (in fact, $G_\delta$ dense) Gateaux differentiability of convex continuous functions
defined on separable Banach spaces, which was established by Mazur in 1933, paved the way to a fruitful 
research line culminating with the theory of Asplund spaces (see [TTJ [T^] ) . Symmetrically, some convexity 
is hidden in smoothness. Indeed, smoothness at various levels provides powerful and widely exploited criteria 
for detecting convexity of functions. Another issue arising within this interplay is how to recognize convexity
of images through smooth mappings. In fact, only a few classes of mappings between vector spaces are known, 
of course besides the linear ones, to guarantee convexity of images of convex subsets of the domain space. 
Yet, such a question seems to be of crucial interest in connection with optimization and control theory related 
topics. For instance, the famous Lyapunov's convexity theorem on the range of a nonatomic finite dimensional 
vector measure found a relevant application in the formulation of the "bang-bang" principle, a fundamental 
result in control theory, as well as in several areas of mathematical economics. 

Historically, it seems not to be so easy to trace back an origin for the problem of recognizing convexity of 
images of sets under mappings. In this concern, one should not omit to mention the studies on the convexity of 
images of spheres through vector quadratic forms, which were triggered by the Toeplitz-Hausdorff theorem (see
[Tlll3j and references therein). A significant step towards a theory embracing wide classes of mappings between 
abstract spaces was made with the appearance of a result due to B.T. Polyak (see [14, 15]). He succeeded in
proving that $C^{1,1}$ mappings between Hilbert spaces, which are regular at a given point, carry balls centered at
that point to convex sets (with nonempty interior), provided the radii of the balls are sufficiently small. The
opinion maintained by the author of the present paper is that, despite its importance, this result (henceforth
referred to as the Polyak convexity principle) has not so far received adequate attention and deserves
wider popularization, especially among researchers working in variational analysis and optimization.
The aim of the present paper is therefore to stimulate further developments on this subject. This
is done by showing that the validity of the Polyak convexity principle is not limited to the Hilbert space setting, 
but it can be extended to a certain subclass of uniformly convex Banach spaces. These form a well-known 
subclass of reflexive Banach spaces and are characterized by the rotund shape of their balls, quantitatively 
described by their respective moduli of convexity. A key element which makes possible the extension to such 
setting of the aforementioned principle is a condition on the asymptotic behaviour of the modulus of convexity, 
to be combined with the smoothness assumption on the given mapping. This is because, as already remarked 
by Polyak, the convexity principle is not able to preserve convexity of images of general subsets, but, relying 
on approximation/perturbation techniques of variational analysis, it needs a certain "rotund geometry" on
the domain space, which is able to guarantee a "stable form" of convexity. In fact, a possible way of looking
at the Polyak principle is as a solvability result on smooth equations, with the known term subject to
perturbations restricted to a convex set.

A remarkable consequence of the Polyak convexity principle is that, to a certain extent, $C^{1,1}$ smoothness
accompanied by regularity yields a local convex behaviour of mappings. This fact may not be so striking, if
one takes into account the nice characterization of $C^{1,1}$ functions in Hilbert spaces found by Hiriart-Urruty and
Plazanet in 1989 (see [5]). According to it, a function $\varphi : H \to \mathbb{R}$ defined on a Hilbert space $(H, \|\cdot\|)$ is
$C^{1,1}$ iff there is some positive constant $a$ such that $\varphi + a\|\cdot\|^2$ and $-\varphi + a\|\cdot\|^2$ are both convex functions. In
particular, $C^{1,1}$ functions are known to be differences of convex functions (notice, again, a manifestation of the
interplay between smoothness and convexity).

The beneficial effects of the convexity hidden in smoothness can be clearly appreciated when dealing with
optimization problems. On this theme, Polyak himself observed, on the basis of the convexity principle, that
nonlinear problems in mathematical programming with $C^{1,1}$ data behave like convex programs near regular
feasible points. It would be useful if such a result, having notable consequences both from the theoretical
and the computational point of view, could be extended far beyond the finite dimensional setting in which it was
first presented (see [14, 15]). An attempt to proceed in this direction is made in the second part of the
present work. The convexity property of images of convex sets under smooth mappings has been investigated 
also in [2], where nonlocal sufficient conditions are proposed in a finite dimensional setting. 

The material presented in this paper is arranged in four main sections, including the current one.
Section 2 collects miscellaneous notions from nonlinear analysis and the geometrical theory of Banach spaces.
Related technical facts, which are needed in the subsequent analysis, are established. The main result, that
is an extension of the Polyak convexity principle to an adequate Banach space setting, is presented and
discussed in Section 3. Section 4 is reserved to provide applications of the main result to some topics of
nonlinear optimization in Banach spaces. In particular, a variational principle on the convex behaviour of
proper localizations of constrained extremum problems with $C^{1,1}$ data is derived. Its consequences on the
Lagrangian duality and on problem calmness are subsequently explored.



2. Notations and preliminaries 



Throughout the paper, whenever $(X, \|\cdot\|)$ is a Banach space, $B(x, r)$ denotes the ball with centre at $x \in X$
and radius $r > 0$. The same notation is used also for balls in metric spaces. The null vector of $X$ is marked by
$\mathbf{0}$. The unit ball, i.e. the set $B(\mathbf{0}, 1)$, is simply denoted by $\mathbb{B}$, whereas the unit sphere by $\mathbb{S}$. Given $x_1, x_2 \in X$,
the closed line segment with endpoints $x_1$ and $x_2$ is indicated by $[x_1, x_2]$. If $S$ is a subset of a Banach space,
$\mathrm{int}\,S$, $\mathrm{bd}\,S$ and $\mathrm{cl}\,S$ denote the interior, the boundary and the (topological) closure of $S$, respectively.

2.1. Uniformly convex Banach spaces and their moduli. Given a Banach space $(X, \|\cdot\|)$, some features
of the geometry of $X$, related to the rotundity of its ball, can be quantitatively described by means of the
function $\delta_X : [0, 2] \to \mathbb{R}$, defined by

$$\delta_X(\epsilon) = \inf\left\{ 1 - \left\|\frac{x_1 + x_2}{2}\right\| \ :\ x_1, x_2 \in \mathbb{B},\ \|x_1 - x_2\| \ge \epsilon \right\},$$

which is called the modulus of convexity of $(X, \|\cdot\|)$. It is possible to prove that the modulus of convexity of
a given Banach space admits the following equivalent representations

$$\delta_X(\epsilon) = \inf\left\{ 1 - \left\|\frac{x_1 + x_2}{2}\right\| \ :\ x_1, x_2 \in \mathbb{B},\ \|x_1 - x_2\| = \epsilon \right\}
= \inf\left\{ 1 - \left\|\frac{x_1 + x_2}{2}\right\| \ :\ x_1, x_2 \in \mathbb{S},\ \|x_1 - x_2\| = \epsilon \right\}$$

(see, for instance, [7]).

Definition 2.1. A Banach space $(X, \|\cdot\|)$ is called uniformly convex if $\delta_X(\epsilon) > 0$ for every $\epsilon \in (0, 2]$.

Example 2.2. All Hilbert spaces are uniformly convex. Indeed, by a straightforward application of the
parallelogram law it is possible to show that, if $(H, \|\cdot\|)$ is a Hilbert space, then

$$\delta_H(\epsilon) = 1 - \sqrt{1 - \frac{\epsilon^2}{4}}, \quad \forall \epsilon \in [0, 2].$$

The Banach spaces $l^p$, $L^p$, and $W^p_m$ are known to be uniformly convex if $1 < p < \infty$. In particular, if $p \ge 2$
their respective moduli of convexity can be explicitly calculated. They turn out to be

$$\delta_{l^p}(\epsilon) = \delta_{L^p}(\epsilon) = \delta_{W^p_m}(\epsilon) = 1 - \left[ 1 - \left(\frac{\epsilon}{2}\right)^p \right]^{1/p}, \quad \forall \epsilon \in [0, 2].$$

If $1 < p < 2$, relying on the asymptotic behaviour of the modulus of convexity, the following estimate from
below is known to hold

$$\delta_{l^p}(\epsilon) = \delta_{L^p}(\epsilon) = \delta_{W^p_m}(\epsilon) > \frac{p-1}{8}\,\epsilon^2, \quad \forall \epsilon \in (0, 2].$$
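The formulas of Example 2.2 can be probed numerically. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and the closed-form expression for $p \ge 2$ is taken as quoted in the example. It checks on a grid that the Hilbert modulus dominates $\epsilon^2/8$, and it suggests that for $p = 4$ the ratio $\delta(\epsilon)/\epsilon^2$ becomes very small near $0$, so that a quadratic lower bound cannot be read off from that formula.

```python
import math


def delta_hilbert(eps: float) -> float:
    """Modulus of convexity of a Hilbert space, as given in Example 2.2."""
    return 1.0 - math.sqrt(1.0 - eps * eps / 4.0)


def delta_lp(p: float, eps: float) -> float:
    """Closed-form modulus quoted in Example 2.2 for p >= 2 (assumed valid as stated there)."""
    return 1.0 - (1.0 - (eps / 2.0) ** p) ** (1.0 / p)


# Grid of values eps in (0, 2]; the last grid point is exactly 2.0.
grid = [2.0 * k / 200 for k in range(1, 201)]

# Quadratic lower bound with c = 1/8 in the Hilbert case (small slack for rounding).
hilbert_ok = all(delta_hilbert(e) >= e * e / 8.0 - 1e-15 for e in grid)

# For p = 4 the quotient delta(eps) / eps^2 is tiny at eps = 0.01.
ratio_p4 = delta_lp(4.0, 1e-2) / (1e-2) ** 2
```

For $p = 2$ the two functions coincide, which is a quick consistency check between the Hilbert formula and the $l^p$ one.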

Example 2.3. As a consequence of James' characterization of weak compactness, one can deduce that,
if a Banach space is uniformly convex, then it must be reflexive. As a consequence, such spaces as $c_0$, $L^1$ and
$L^\infty$ fail to be uniformly convex.

For the purposes of the present investigations, a geometrical property of a special subclass of uniformly 
convex spaces is needed. Loosely speaking, such property prescribes a quadratic estimate from below for the 
distance of the middle point of two elements in a ball from the boundary of that ball. Not surprisingly, a 
sufficient condition for the validity of such an estimate can be given in terms of modulus of convexity. 

Lemma 2.4. Let $(X, \|\cdot\|)$ be a uniformly convex Banach space. Suppose that its modulus of convexity fulfils
the condition

$$\delta_X(\epsilon) \ge c\,\epsilon^2, \quad \forall \epsilon \in [0, 2], \tag{2.1}$$

for some $c > 0$. Then, for every $x_0, x_1, x_2 \in X$ and $r > 0$, with $x_1, x_2 \in B(x_0, r)$, it holds

$$B\left( \frac{x_1 + x_2}{2},\ \frac{c\,\|x_1 - x_2\|^2}{r} \right) \subseteq B(x_0, r).$$



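Proof. Fix $r > 0$. Since the distance induced on $X$ by $\|\cdot\|$ is invariant under translations, without loss of
generality it is possible to assume that $x_0 = \mathbf{0}$. By using one of the possible representations of $\delta_X$, one has

$$\inf_{\substack{x_1, x_2 \in r\mathbb{B} \\ \|x_1 - x_2\| = r\epsilon}} \left[ r - \left\|\frac{x_1 + x_2}{2}\right\| \right]
= r \inf_{\substack{u_1, u_2 \in \mathbb{B} \\ \|u_1 - u_2\| = \epsilon}} \left[ 1 - \left\|\frac{u_1 + u_2}{2}\right\| \right]
= r\,\delta_X(\epsilon), \quad \forall \epsilon \in [0, 2].$$

By virtue of condition (2.1) one obtains

$$r - \sup_{\substack{x_1, x_2 \in r\mathbb{B} \\ \|x_1 - x_2\| = r\epsilon}} \left\|\frac{x_1 + x_2}{2}\right\| \ge r c\,\epsilon^2, \quad \forall \epsilon \in [0, 2].$$

This amounts to saying that

$$\left\|\frac{x_1 + x_2}{2}\right\| + \frac{c\,\|x_1 - x_2\|^2}{r} \le r, \quad \forall x_1, x_2 \in r\mathbb{B},\ \|x_1 - x_2\| = r\epsilon.$$

Since the last inequality is true for every $\epsilon \in [0, 2]$, it follows

$$\left\|\frac{x_1 + x_2}{2}\right\| + \frac{c\,\|x_1 - x_2\|^2}{r} \le r, \quad \forall x_1, x_2 \in r\mathbb{B}.$$

Thus, by applying the triangle inequality, whenever $x \in B\left( \frac{x_1 + x_2}{2}, \frac{c\|x_1 - x_2\|^2}{r} \right)$, one obtains

$$\|x\| \le \left\| x - \frac{x_1 + x_2}{2} \right\| + \left\|\frac{x_1 + x_2}{2}\right\| \le \frac{c\,\|x_1 - x_2\|^2}{r} + \left\|\frac{x_1 + x_2}{2}\right\| \le r,$$

which completes the proof. □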
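The inclusion stated in Lemma 2.4 can be sanity-checked in the simplest Hilbert case. The Python sketch below is an illustration only, under assumptions that are not in the paper: the space is $\mathbb{R}^2$ with the Euclidean norm, the centre is $x_0 = \mathbf{0}$, the constant is $c = 1/8$, and the helper names are hypothetical. It evaluates the quantity $r - \left( \|(x_1+x_2)/2\| + c\|x_1-x_2\|^2/r \right)$, which should be nonnegative for all $x_1, x_2$ in the ball of radius $r$.

```python
import math
import random


def lemma_2_4_gap(x1, x2, r, c=1.0 / 8.0):
    """Return r - (||(x1+x2)/2|| + c*||x1-x2||^2 / r) for the Euclidean norm on R^2.

    A nonnegative value means that the ball of Lemma 2.4, centred at the
    midpoint, is contained in the ball of radius r centred at the origin.
    """
    mid_norm = math.hypot((x1[0] + x2[0]) / 2.0, (x1[1] + x2[1]) / 2.0)
    dist = math.hypot(x1[0] - x2[0], x1[1] - x2[1])
    return r - (mid_norm + c * dist * dist / r)


def random_point_in_disc(rng, r):
    """Rejection sampling of a point in the closed Euclidean disc of radius r."""
    while True:
        u, v = rng.uniform(-r, r), rng.uniform(-r, r)
        if u * u + v * v <= r * r:
            return (u, v)


rng = random.Random(0)  # fixed seed, so the experiment is reproducible
r = 0.5
worst_gap = min(
    lemma_2_4_gap(random_point_in_disc(rng, r), random_point_in_disc(rng, r), r)
    for _ in range(10000)
)
```

Two boundary cases are instructive: for $x_1 = x_2$ on the sphere the gap vanishes, whereas for antipodal points the midpoint is the centre and the gap equals $r/2$.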
Remark 2.5. Notice that, whenever $(X, \|\cdot\|)$ is in particular a Hilbert space, in the light of what has been
noted in Example 2.2, condition (2.1) turns out to be satisfied with $c = 1/8$. Besides, all spaces $l^p$, $L^p$, and
$W^p_m$, with $1 < p < 2$, admit a modulus of convexity satisfying (2.1) with $c = (p-1)/8$.
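The value $c = 1/8$ in the Hilbert case can be verified directly from the formula for the modulus of convexity of a Hilbert space recalled in Example 2.2. The short derivation below is a sketch added for convenience; it uses only the elementary inequality $\sqrt{1-t} \le 1 - t/2$ on $[0, 1]$.

```latex
% Elementary bound: for t in [0,1], (1 - t/2)^2 = 1 - t + t^2/4 >= 1 - t, hence
\sqrt{1-t} \;\le\; 1 - \frac{t}{2}, \qquad t \in [0,1].
% Apply it with t = \epsilon^2/4, which lies in [0,1] whenever \epsilon \in [0,2]:
\delta_H(\epsilon) \;=\; 1 - \sqrt{1 - \frac{\epsilon^2}{4}}
  \;\ge\; 1 - \Bigl( 1 - \frac{\epsilon^2}{8} \Bigr)
  \;=\; \frac{\epsilon^2}{8},
  \qquad \forall \epsilon \in [0,2].
```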

For further details on the theory of uniformly convex Banach spaces and their moduli of convexity the 
reader is referred to [10] . 



2.2. Some properties of $C^{1,1}$ mappings. Let $(X, \|\cdot\|)$ and $(Y, \|\cdot\|)$ be Banach spaces. The Banach space of
all bounded linear operators between $X$ and $Y$, equipped with the operator norm, is denoted by $(\mathcal{L}(X, Y), \|\cdot\|_{\mathcal{L}})$.
The space $\mathcal{L}(X, \mathbb{R})$ is simply marked by $X^*$, with $\langle \cdot, \cdot \rangle : X^* \times X \to \mathbb{R}$ denoting the duality pairing $X^*$ with
$X$. The null vector of a dual space is marked by $\mathbf{0}^*$. If $S \subseteq X$ is a nonempty set, $S^{\ominus} = \{x^* \in X^* : \langle x^*, x \rangle \le
0,\ \forall x \in S\}$ represents the negative dual cone of $S$, while $S^{\perp} = S^{\ominus} \cap (-S^{\ominus})$ is the annihilator of $S$. If $x_0 \in S$,
$N(x_0, S)$ stands for the normal cone of $S$ at $x_0$ in the sense of convex analysis. Given a mapping $f : \Omega \to Y$,
with $\Omega$ an open subset of $X$, and given $x_0 \in \Omega$, the Fréchet derivative of $f$ at $x_0$ is denoted by $Df(x_0) \in \mathcal{L}(X, Y)$.
If $f$ is Fréchet differentiable at $x_0$, the remainder in its first-order expansion is denoted by $o(x_0; \cdot)$, i.e.

$$o(x_0; h) = f(x_0 + h) - f(x_0) - Df(x_0)[h], \quad h \in X,\ x_0 + h \in \Omega.$$

If a mapping $f : \Omega \to Y$ is Fréchet differentiable at each point of $\Omega$ and the mapping $Df : \Omega \to \mathcal{L}(X, Y)$ is
Lipschitz continuous on $\Omega$, $f$ is said to be $C^{1,1}$ on $\Omega$. The space of all such mappings is indicated by $C^{1,1}(\Omega)$.
If $f \in C^{1,1}(\Omega)$, the infimum over all values $\kappa > 0$ such that

$$\|Df(x_1) - Df(x_2)\|_{\mathcal{L}} \le \kappa\,\|x_1 - x_2\|, \quad \forall x_1, x_2 \in \Omega,$$

is called the modulus of Lipschitz continuity of $Df$ on $\Omega$ and is indicated by $\mathrm{lip}(Df; \Omega)$.

The proof of a lemma useful in the sequel involves elements of integral calculus for mappings between
Banach spaces. In this concern, take into account that, given a compact interval $[a, b] \subseteq \mathbb{R}$ and $f : [a, b] \to Y$,
its integral over $[a, b]$, denoted by $\int_a^b f(t)\,dt$, is to be intended in the sense of Gavurin. Roughly speaking, this
means that such an integral can be defined by partitioning $[a, b]$ into finitely many subintervals and by taking
the limit of the integral sums as the partition mesh width goes to 0. It has been shown that every continuous
mapping is integrable in this sense. As a further step, given a mapping $G : X \to \mathcal{L}(X, Y)$ and $x_0, h \in X$,
define

$$\int_{x_0}^{x_0 + h} G(x)\,dx = \int_0^1 G(x_0 + th)[h]\,dt.$$

In such a setting, an analogue of the fundamental theorem of classical integral calculus can be stated as follows.

Theorem 2.6. Let $f : X \to Y$ be a mapping between Banach spaces, let $\Omega$ be an open subset of $X$, and let
$x_0 \in \Omega$, $h \in X$ with $[x_0, x_0 + h] \subseteq \Omega$. If $f \in C^1(\Omega)$, then $\int_{x_0}^{x_0+h} Df(x)\,dx$ exists and

$$\int_{x_0}^{x_0 + h} Df(x)\,dx = f(x_0 + h) - f(x_0).$$

For more details on this topic see [S] and references therein. One is now in a position to establish a lemma, 
which will be used in the subsequent section. 



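Lemma 2.7. Let $f : X \to Y$ be a mapping between Banach spaces, let $\Omega$ be an open subset of $X$, and let
$x_1, x_2 \in \Omega$, with $[x_1, x_2] \subseteq \Omega$. If $f \in C^{1,1}(\Omega)$ and $\bar{x} = \frac{x_1 + x_2}{2}$, then it holds

$$\left\| o\left( \bar{x};\ \frac{x_1 - x_2}{2} \right) \right\| \le \frac{\mathrm{lip}(Df; \Omega)}{8}\,\|x_1 - x_2\|^2.$$

Consequently,

$$\left\| \frac{f(x_1) + f(x_2)}{2} - f\left( \frac{x_1 + x_2}{2} \right) \right\| \le \frac{\mathrm{lip}(Df; \Omega)}{8}\,\|x_1 - x_2\|^2. \tag{2.2}$$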
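Proof. The first assertion is a well-known result, whose proof follows a standard argument based on Theorem
2.6 and is provided here for the sake of completeness. From the first-order expansion of $f$ near $\bar{x}$, one
obtains

$$\left\| o\left( \bar{x};\ \frac{x_1 - x_2}{2} \right) \right\|
= \left\| f(x_1) - f(\bar{x}) - Df(\bar{x})\left[ \frac{x_1 - x_2}{2} \right] \right\|
= \left\| \int_0^1 \bigl( Df(\bar{x} + t(x_1 - \bar{x})) - Df(\bar{x}) \bigr)\left[ \frac{x_1 - x_2}{2} \right] dt \right\|$$

$$\le \int_0^1 \| Df(\bar{x} + t(x_1 - \bar{x})) - Df(\bar{x}) \|_{\mathcal{L}}\ \left\| \frac{x_1 - x_2}{2} \right\| dt
\le \mathrm{lip}(Df; \Omega)\,\left\| \frac{x_1 - x_2}{2} \right\|^2 \int_0^1 t\,dt
= \frac{\mathrm{lip}(Df; \Omega)}{8}\,\|x_1 - x_2\|^2.$$

As for the second assertion, by adding up the two first-order expansions of mapping $f$ at $\bar{x}$ below

$$f(x_i) = f(\bar{x}) + Df(\bar{x})[x_i - \bar{x}] + o(\bar{x}; x_i - \bar{x}), \quad i = 1, 2,$$

and dividing by 2, one gets

$$\frac{f(x_1) + f(x_2)}{2} = f(\bar{x})
+ \frac{1}{2}\left( Df(\bar{x})\left[ \frac{x_1 - x_2}{2} \right] + Df(\bar{x})\left[ \frac{x_2 - x_1}{2} \right] \right)
+ \frac{1}{2}\left( o\left( \bar{x};\ \frac{x_1 - x_2}{2} \right) + o\left( \bar{x};\ \frac{x_2 - x_1}{2} \right) \right).$$

Thus, by linearity of the derivative $Df(\bar{x})$, one obtains

$$\left\| \frac{f(x_1) + f(x_2)}{2} - f(\bar{x}) \right\|
\le \frac{1}{2}\left( \left\| o\left( \bar{x};\ \frac{x_1 - x_2}{2} \right) \right\| + \left\| o\left( \bar{x};\ \frac{x_2 - x_1}{2} \right) \right\| \right).$$

The inequality to be proved can be easily derived from the last one by taking into account the estimate
provided in the first part of the thesis, applied also with the roles of $x_1$ and $x_2$ interchanged. □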
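Estimate (2.2) can be tested on a concrete smooth map. The sketch below is an illustration under assumptions made here, not taken from the proof: the map is $f(x_1, x_2) = (x_1, x_1^2 + x_2)$ on Euclidean $\mathbb{R}^2$, for which $Df(x) - Df(y)$ has operator norm $2|x_1 - y_1|$, so that the value $\mathrm{lip}(Df; \mathbb{R}^2) = 2$ is used; the helper names are hypothetical.

```python
import math
import random

LIP_DF = 2.0  # modulus of Lipschitz continuity of Df for the map below (assumption explained above)


def f(x):
    """Illustrative C^{1,1} map f(x1, x2) = (x1, x1^2 + x2)."""
    return (x[0], x[0] ** 2 + x[1])


def midpoint_defect(a, b):
    """Euclidean norm of (f(a) + f(b))/2 - f((a + b)/2), the left-hand side of (2.2)."""
    fa, fb = f(a), f(b)
    fm = f(((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0))
    return math.hypot((fa[0] + fb[0]) / 2.0 - fm[0], (fa[1] + fb[1]) / 2.0 - fm[1])


def bound_2_2(a, b):
    """Right-hand side of (2.2): lip(Df) / 8 * ||a - b||^2."""
    return LIP_DF / 8.0 * ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)


rng = random.Random(0)  # fixed seed for reproducibility
pairs = [
    ((rng.uniform(-1, 1), rng.uniform(-1, 1)), (rng.uniform(-1, 1), rng.uniform(-1, 1)))
    for _ in range(10000)
]
estimate_holds = all(midpoint_defect(a, b) <= bound_2_2(a, b) + 1e-12 for a, b in pairs)
```

For this particular map the defect equals $(a_1 - b_1)^2/4$, so the bound is attained when the two points differ only in the first coordinate, e.g. $a = (1, 0)$ and $b = (-1, 0)$.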

2.3. Metric regularity and linear openness. A key assumption playing a crucial role in the proof of the
main result is local metric regularity. Recall that a mapping $f : X \to Y$ between Banach spaces is said to be
metrically regular around $(x_0, f(x_0))$, with $x_0 \in X$, if there exist positive $\delta$, $\zeta$ and $\mu$ such that

$$\mathrm{dist}(x; f^{-1}(y)) \le \mu\,\|y - f(x)\|, \quad \forall x \in B(x_0, \delta),\ \forall y \in B(f(x_0), \zeta),$$

where $\mathrm{dist}(x; S) = \inf_{s \in S} \|s - x\|$ denotes the distance of $x$ from set $S$. For mappings which are strictly
differentiable at $x_0$ (and hence, for mappings $C^1$ in an open neighbourhood of $x_0$) the following celebrated
criterion for metric regularity holds (see, for instance, Theorem 1.57 in 

Theorem 2.8 (Lyusternik-Graves). Let $f : X \to Y$ be a mapping between Banach spaces. Suppose $f$ to
be strictly differentiable at $x_0 \in X$. Then $f$ is metrically regular around $(x_0, f(x_0))$ iff $Df(x_0) \in \mathcal{L}(X, Y)$ is
onto.

An equivalent reformulation of metric regularity will also be exploited in the sequel, which refers to a
local surjection property known as openness at a linear rate around $(x_0, f(x_0))$. It postulates the existence of
positive $\delta$, $\zeta$ and $\sigma$ such that

$$f(B(x, r)) \supseteq B(f(x), \sigma r) \cap B(f(x_0), \zeta), \quad \forall x \in B(x_0, \delta),\ \forall r \in [0, \delta). \tag{2.3}$$

Actually, metric regularity and openness at a linear rate describe a Lipschitzian behaviour of mappings, which
can be considered in the more general setting of metric spaces. Given a mapping $f : X \to Y$ between metric
spaces, for the purposes of the present analysis it is convenient to recall also the notion of openness with
respect to a given subset $S \subseteq X$. Mapping $f$ is said to be open at a linear rate on $S$ if there exists $\sigma > 0$ such
that for every $x \in S$ and every $r > 0$, with $B(x, r) \subseteq S$, it holds

$$f(B(x, r)) \supseteq B(f(x), \sigma r).$$

A relevant consequence that openness on a given subset bears, already if considered in metric spaces, is stated 
in the next lemma, whose proof can be found for instance in [B]. 

Lemma 2.9. Let $f : X \to Y$ be a mapping between metric spaces and let $S \subseteq X$. Suppose that:

(i) $X$ is metrically complete, whereas the metric of $Y$ is invariant under translation;

(ii) $f \in C(X)$ and is open at a linear rate on $S$;

(iii) $\mathrm{int}\,S \ne \emptyset$.

Then, also $Y$ is metrically complete.
The above lemma is employed next to prove the following property useful in the sequel.

Lemma 2.10. Let $f : X \to Y$ be a mapping between Banach spaces, let $\Omega \subseteq X$ be an open set and let $x_0 \in \Omega$.
Suppose that $f \in C(\Omega)$ and it is open at a linear rate around $(x_0, f(x_0))$. Then, there exists $r_0 > 0$ such that,
for every $r \in (0, r_0]$ the set $f(B(x_0, r))$ is closed.

Proof. Since $\Omega$ is open, it is possible to take $\tilde{r} > 0$ in such a way that $B(x_0, \tilde{r}) \subseteq \Omega$. Notice that, as $f$ is
continuous at $x_0$, inclusion (2.3), valid owing to the linear openness of $f$, entails the existence of $\tilde{\delta} > 0$ such
that

$$f(B(x, r)) \supseteq B(f(x), \sigma r), \quad \forall x \in B(x_0, \tilde{\delta}),\ \forall r \in [0, \tilde{\delta}). \tag{2.4}$$

Indeed, corresponding to $\zeta$ one can find $\delta_\zeta > 0$ such that

$$f(B(x_0, \delta_\zeta)) \subseteq B(f(x_0), \zeta/2).$$

Thus, by taking $\tilde{\delta} > 0$ in such a way that

$$\tilde{\delta} < \min\left\{ \tilde{r},\ \delta,\ \delta_\zeta,\ \frac{\zeta}{2\sigma} \right\},$$

one obtains that, whenever $x \in B(x_0, \tilde{\delta})$, it is $f(x) \in B(f(x_0), \zeta/2)$. It follows that, if $t \in (0, \tilde{\delta}]$, and hence
$\sigma t < \zeta/2$, one has

$$B(f(x), \sigma t) \subseteq B(f(x_0), \zeta).$$

Being $\tilde{\delta} < \delta$, the last inclusion reduces (2.3) to (2.4). Set $r_0 = \tilde{\delta}$ and fix an arbitrary $r \in (0, r_0]$. Put in that
form, openness at a linear rate around $(x_0, f(x_0))$ implies openness on $B(x_0, r/2)$, because only balls satisfying
$B(x, t) \subseteq B(x_0, r/2)$ must be considered. One is then in a position to apply Lemma 2.9 with $X = B(x_0, r)$,
$S = B(x_0, r/2)$ and $Y = f(B(x_0, r))$. It follows that $f(B(x_0, r))$ is metrically complete and, as such, it must
be a closed subset of $Y$. This completes the proof. □



3. The Polyak convexity principle in Banach spaces 

Before presenting the main result of the paper, to ease the presentation of its proof, a fact concerning
convexity of sets is explicitly stated, whose proof can be obtained without difficulty.

Lemma 3.1. Let $S \subseteq Y$ be a closed subset of a Banach space. $S$ is convex iff $\frac{1}{2}(y_1 + y_2) \in S$ whenever
$y_1, y_2 \in S$.

Theorem 3.2. Let $f : X \to Y$ be a mapping between Banach spaces, let $\Omega$ be an open subset of $X$, let $x_0 \in \Omega$,
and $r > 0$ such that $B(x_0, r) \subseteq \Omega$. Suppose that:

(i) $(X, \|\cdot\|)$ is uniformly convex and its convexity modulus fulfils condition (2.1);
(ii) $f \in C^{1,1}(\Omega)$ and $Df(x_0) \in \mathcal{L}(X, Y)$ is onto.

Then, there exists $\epsilon_0 \in (0, r)$ such that $f(B(x_0, \epsilon))$ is convex, for every $\epsilon \in [0, \epsilon_0]$.

Proof. Under hypothesis (ii) it is possible to invoke the Lyusternik-Graves theorem. According to it, mapping
$f$ is locally metrically regular around $(x_0, f(x_0))$. This means that there exist $\mu > 0$, $\zeta > 0$, and $\delta_\mu > 0$ such
that

$$\mathrm{dist}(x; f^{-1}(y)) \le \mu\,\|y - f(x)\|, \quad \forall x \in B(x_0, \delta_\mu),\ \forall y \in B(f(x_0), \zeta). \tag{3.1}$$

By continuity of $f$ at $x_0$, corresponding to $\zeta$ there exists $\delta_\zeta > 0$ such that

$$f(x) \in B(f(x_0), \zeta), \quad \forall x \in B(x_0, \delta_\zeta).$$

Since $f$ is continuous on $\Omega$ and open at a linear rate around $(x_0, f(x_0))$, by virtue of Lemma 2.10 there exists
$r_0 > 0$ such that $f(B(x_0, t))$ is closed for every $t \in (0, r_0]$. Now, take $\epsilon_0$ in such a way that

$$0 < \epsilon_0 < \min\left\{ r,\ r_0,\ \delta_\zeta,\ \delta_\mu,\ \frac{8c}{2\mu\,(\mathrm{lip}(Df; \Omega) + 1)} \right\},$$

where $c$ is as in (2.1), and fix an arbitrary $\epsilon \in (0, \epsilon_0]$, the case $\epsilon = 0$ being trivial. In the light of Lemma 3.1,
in order to show that $f(B(x_0, \epsilon))$ is convex, it suffices to prove that, taken any pair $y_1, y_2 \in f(B(x_0, \epsilon))$ and
set

$$\bar{y} = \frac{y_1 + y_2}{2},$$

then also $\bar{y}$ happens to belong to $f(B(x_0, \epsilon))$. To this aim, corresponding to $y_1, y_2$, take $x_1, x_2 \in B(x_0, \epsilon)$
such that $f(x_1) = y_1$ and $f(x_2) = y_2$ and define

$$\bar{x} = \frac{x_1 + x_2}{2}.$$

Since it is $\epsilon < \delta_\zeta$, the continuity of $f$ at $x_0$ implies

$$y_1, y_2 \in B(f(x_0), \zeta),$$

and hence $\bar{y} \in B(f(x_0), \zeta)$. Thus, since it is also $\epsilon < \delta_\mu$, then, being $(\bar{x}, \bar{y}) \in B(x_0, \delta_\mu) \times B(f(x_0), \zeta)$, by
recalling inequality (3.1) one obtains

$$\mathrm{dist}(\bar{x}; f^{-1}(\bar{y})) \le \mu\,\|\bar{y} - f(\bar{x})\|. \tag{3.2}$$

If it is $\bar{y} = f(\bar{x})$, one achieves immediately what was to be proved. So, suppose that $\|\bar{y} - f(\bar{x})\| > 0$. From
inequality (3.2) it follows that, corresponding to $2\mu$, there exists $\hat{x} \in f^{-1}(\bar{y})$, such that

$$\|\hat{x} - \bar{x}\| < 2\mu\,\|\bar{y} - f(\bar{x})\|.$$

In force of hypothesis (ii) it is possible to apply the estimate (2.2) in Lemma 2.7, according to which one finds

$$\|\hat{x} - \bar{x}\| < 2\mu\,\frac{\mathrm{lip}(Df; \Omega)}{8}\,\|x_1 - x_2\|^2.$$

Therefore, since by the above positions it is

$$\frac{2\mu\,(\mathrm{lip}(Df; \Omega) + 1)}{8c} < \frac{1}{\epsilon},$$

it results in

$$\hat{x} \in B\left( \bar{x},\ \frac{c\,\|x_1 - x_2\|^2}{\epsilon} \right).$$

According to Lemma 2.4, this fact is known to imply that $\hat{x} \in B(x_0, \epsilon)$, by virtue of the condition (2.1)
assumed on the convexity modulus of $(X, \|\cdot\|)$. Thus, $\bar{y}$ has been proved to belong to $f(B(x_0, \epsilon))$, so the proof
is complete. □

Remark 3.3. (i) Since, as noticed in Remark 2.5, every Hilbert space is a uniformly convex Banach space
whose modulus of convexity fulfils condition (2.1), Theorem 3.2 is actually an extension of the Polyak convexity
principle. Notice that no assumption on the geometry of the range space $Y$ has been made.

(ii) The regularity condition requiring $Df(x_0)$ to be onto cannot be dropped, even in the case of very
simple mappings acting in finite-dimensional spaces. Consider, indeed, $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by

$$f(x_1, x_2) = \bigl( x_1 + x_2,\ (x_1 + x_2)^2 \bigr),$$

and $x_0 = (0, 0) = \mathbf{0}$, $\mathbb{R}^2$ being equipped with its usual Hilbert space structure. Mapping $f \in C^2(\mathbb{R}^2)$, so it
belongs to $C^{1,1}(\mathrm{int}\,B(\mathbf{0}, r))$, for a proper $r > 0$. Its Jacobian matrix has rank 1 at $\mathbf{0}$, so $Df(\mathbf{0})$ cannot cover
$\mathbb{R}^2$. It is readily seen that, for every $\epsilon > 0$,

$$f(B(\mathbf{0}, \epsilon)) = \{ (y_1, y_2) \in \mathbb{R}^2 : y_2 = y_1^2,\ y_1 \in [-\sqrt{2}\,\epsilon, \sqrt{2}\,\epsilon] \},$$

which is not a convex subset of $\mathbb{R}^2$.

(iii) The next example shows that one cannot hope to extend Theorem 3.2 out of the class of uniformly
convex Banach spaces. Suppose $\mathbb{R}^2$ to be equipped with the norm $\|x\|_\infty = \max\{|x_1|, |x_2|\}$, which makes $\mathbb{R}^2$
not uniformly convex. Consider the mapping $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by

$$f(x_1, x_2) = (x_1,\ x_1^2 + x_2),$$

and $x_0 = (0, 0) = \mathbf{0}$. Since $f \in C^2(\mathbb{R}^2)$, it is also $C^{1,1}(\mathrm{int}\,B(\mathbf{0}, r))$, for a proper $r > 0$. Moreover, being $Df(x)$
represented by the matrix

$$\begin{pmatrix} 1 & 0 \\ 2x_1 & 1 \end{pmatrix},$$

the linear mapping $Df(x)$ is onto for every $x \in \mathbb{R}^2$. Nonetheless, since now $B(\mathbf{0}, \epsilon) = [-\epsilon, \epsilon] \times [-\epsilon, \epsilon]$, it results
in

$$f(B(\mathbf{0}, \epsilon)) = \bigcup_{t \in [-\epsilon, \epsilon]} \{ (y_1, y_2) \in \mathbb{R}^2 : y_2 = y_1^2 + t,\ y_1 \in [-\epsilon, \epsilon] \},$$

which can be convex only if $\epsilon = 0$. Since, as a mapping defined on the Hilbert space $\mathbb{R}^2$, $f$ satisfies all
hypotheses of Theorem 3.2, this example shows also that a mapping carrying small balls to convex sets may
happen to carry convex subsets of such balls to nonconvex sets.

(iv) The following complement of Theorem 3.2, already remarked in [14], is worth being mentioned. From
hypothesis (ii) one has that $f(\mathrm{int}\,B(x_0, \epsilon)) \subseteq \mathrm{int}\,f(B(x_0, \epsilon)) \ne \emptyset$, for every $\epsilon \in (0, \epsilon_0]$. Therefore, it holds

$$f^{-1}(\mathrm{bd}\,f(B(x_0, \epsilon))) \subseteq \mathrm{bd}\,B(x_0, \epsilon).$$
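The two counterexamples of Remark 3.3 lend themselves to a quick numerical check. The Python sketch below is only an illustration: the function names, the radius $\epsilon = 0.1$ and the tolerance are assumptions made here. It exhibits, for the map of point (ii), a midpoint of two image points that leaves the parabola arc, and, for the map of point (iii), a midpoint that leaves the image of the sup-norm ball. For comparison, it samples midpoints in the image of the Euclidean ball of the same radius under the map of point (iii), where no violation of midpoint convexity is expected.

```python
import math
import random

TOL = 1e-12  # slack for floating-point rounding


def f_ii(x1, x2):
    """Map of Remark 3.3(ii): its derivative at the origin has rank 1."""
    s = x1 + x2
    return (s, s * s)


def f_iii(x1, x2):
    """Map of Remark 3.3(iii): its derivative is onto at every point."""
    return (x1, x1 * x1 + x2)


def preimage_iii(y):
    """f_iii is a bijection of R^2, so every y has exactly one preimage."""
    return (y[0], y[1] - y[0] * y[0])


def in_image_ii_euclid(y, eps):
    """Image of the Euclidean ball under f_ii: the arc y2 = y1^2, |y1| <= sqrt(2) * eps."""
    return abs(y[1] - y[0] * y[0]) <= TOL and abs(y[0]) <= math.sqrt(2.0) * eps + TOL


def in_image_iii_sup(y, eps):
    """Membership in the image of the sup-norm ball under f_iii."""
    x1, x2 = preimage_iii(y)
    return max(abs(x1), abs(x2)) <= eps + TOL


def in_image_iii_euclid(y, eps):
    """Membership in the image of the Euclidean ball under f_iii."""
    x1, x2 = preimage_iii(y)
    return x1 * x1 + x2 * x2 <= eps * eps + TOL


def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def random_point_in_disc(rng, r):
    """Rejection sampling of a point in the closed Euclidean disc of radius r."""
    while True:
        u, v = rng.uniform(-r, r), rng.uniform(-r, r)
        if u * u + v * v <= r * r:
            return (u, v)


eps = 0.1
h = eps / math.sqrt(2.0)

# (ii): midpoint of the two endpoints of the parabola arc.
mid_ii = midpoint(f_ii(h, h), f_ii(-h, -h))

# (iii), sup-norm ball: midpoint of the images of two corners of the square.
mid_iii_sup = midpoint(f_iii(eps, eps), f_iii(-eps, eps))

# (iii), Euclidean ball: sampled midpoints of image points stay in the image.
rng = random.Random(1)
euclid_midpoints_ok = all(
    in_image_iii_euclid(
        midpoint(f_iii(*random_point_in_disc(rng, eps)), f_iii(*random_point_in_disc(rng, eps))),
        eps,
    )
    for _ in range(5000)
)
```

The membership tests for the map of point (iii) rely on the explicit inverse $(y_1, y_2) \mapsto (y_1, y_2 - y_1^2)$, so they are exact up to rounding; the sampling experiment is of course no proof of convexity, merely a consistency check with Theorem 3.2 for this radius.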



An interesting question related to Theorem 3.2 is whether it can be extended to some classes of nonsmooth
mappings. In consideration of the importance of nonsmooth analysis in optimization, this further development
would be remarkable and widely motivated. Reduced to its basic elements, as a matter of fact, the proof of
Theorem 3.2 consists in a proper combination of distance estimates relying on a rotund geometry and metric
regularity. The latter has been well understood also for nonsmooth mappings and adequately characterized in
terms of generalized derivatives (see, for a thorough account on the subject, [H]). Nonetheless, within the current
approach, a development in this direction seems to be hardly possible. In this regard, a counterexample
has been devised by A.D. Ioffe showing that already $C^1$ mappings may fail to satisfy the thesis of the
Polyak convexity principle. Apart from the need of Lipschitz continuity of the derivative mapping, another
reason of difficulty is the role crucially played by linearity in the estimates provided by Lemma 2.7 as well as
in preserving convexity of sets. In both such circumstances a successful replacement of linear mappings with
merely positively homogeneous first order approximations seems to be hardly practicable.

4. Applications to optimization 

4.1. A variational principle on the convex behaviour of extremum problems. Carrying on a research
line proposed in [14, 15], this subsection is concerned with the study of local aspects of the theory of constrained
optimization problems of the following form

$$(\mathcal{P}) \qquad \min \varphi(x) \quad \text{subject to} \quad g(x) \in C,$$

where the cost functional $\varphi : X \to \mathbb{R} \cup \{\pm\infty\}$, the constraining mapping $g : X \to Y$ and the set $C \subseteq Y$ are
given problem data. The feasible region of $(\mathcal{P})$ is denoted here by $R = \{x \in X : g(x) \in C\} = g^{-1}(C)$. Set
$Q = (-\infty, 0) \times C \subseteq \mathbb{R} \times Y$ and fix $x_0 \in X$. Following a wide-spread approach in optimization (see, among
others, [6]), the analysis of various features of $(\mathcal{P})$ can be performed by associating with that problem and
with an element $x_0$ a mapping $\mathcal{I}_{\mathcal{P},x_0} : X \to \mathbb{R} \times Y$, defined as

$$\mathcal{I}_{\mathcal{P},x_0}(x) = (\varphi(x) - \varphi(x_0),\ g(x)). \tag{4.1}$$

Such a mapping allows one to characterize the optimality of $x_0$, as stated in the remark below.

Remark 4.1. An element $x_0 \in R$ is a local solution to $(\mathcal{P})$ iff there exists $r > 0$ such that

$$\mathcal{I}_{\mathcal{P},x_0}(B(x_0, r)) \cap Q = \emptyset.$$

Letting $B(x_0, +\infty) = X$, the above disjunction with $r = +\infty$ obviously characterizes global optimality of $x_0$.

Such a characterization is applied in the next result to establish a variational principle involving the classical
Lagrangian function $L : Y^* \times X \to \mathbb{R} \cup \{\pm\infty\}$

$$L(y^*; x) = \varphi(x) + \langle y^*, g(x) \rangle.$$

Given $\epsilon > 0$ and $x_0 \in R$, by an $\epsilon$-localization of problem $(\mathcal{P})$ around $x_0$, the following extremum problem is
meant

$$(\mathcal{P}_{x_0,\epsilon}) \qquad \min_{x \in B(x_0, \epsilon)} \varphi(x) \quad \text{subject to} \quad g(x) \in C.$$

Notice that $(\mathcal{P}_{x_0,\epsilon})$ has the same objective function as $(\mathcal{P})$, but its feasible region results from $B(x_0, \epsilon) \cap R$.

An element $x_0 \in R$ is said to be regular for $(\mathcal{P})$ if $\varphi, g \in C^{1,1}(\Omega)$, where $\Omega$ is an open set containing $x_0$,
and mapping $\mathcal{I}_{\mathcal{P},x_0}$ is regular at $x_0$ in the classical sense, i.e. mapping $D(\varphi, g)(x_0)$ is onto.

The variational principle, which is going to be presented next, states that, in an adequate setting, around
each regular point for $(\mathcal{P})$ and corresponding to each $\epsilon$ small enough, there exists an $\epsilon$-localization of $(\mathcal{P})$
admitting a solution, which further minimizes $L(y^*; \cdot)$, for a proper $y^* \in Y^*$.

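Theorem 4.2. With reference to problem $(\mathcal{P})$, let $\Omega \subseteq X$ be an open set and let $x_0 \in R \cap \Omega$. Suppose that

(i) $(X, \|\cdot\|)$ is uniformly convex and its convexity modulus fulfils condition (2.1);

(ii) $(Y, \|\cdot\|)$ is a reflexive Banach space;

(iii) $\varphi, g \in C^{1,1}(\Omega)$ and $D(\varphi, g)(x_0) \in \mathcal{L}(X, \mathbb{R} \times Y)$ is onto.

Then, there exists a positive $\epsilon_0$ such that for every $\epsilon \in (0, \epsilon_0]$ there are $x_\epsilon \in \mathrm{bd}\,B(x_0, \epsilon)$ and $\lambda_\epsilon \in Y^*$ with the
properties:

$$x_\epsilon \text{ solves problem } (\mathcal{P}_{x_0,\epsilon}), \tag{4.2}$$

$$\lambda_\epsilon \in N(g(x_\epsilon), C), \tag{4.3}$$

and

$$L(\lambda_\epsilon; x_\epsilon) \le L(\lambda_\epsilon; x), \quad \forall x \in B(x_0, \epsilon). \tag{4.4}$$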

Proof. Consider mapping $\mathcal{I}_{\mathcal{P},x_0} : X \to \mathbb{R} \times Y$ associated with problem $(\mathcal{P})$ according to (4.1). Under the
current hypotheses Theorem 3.2 ensures the existence of $\epsilon_0 > 0$ such that $\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon))$ is convex for every
$\epsilon \in (0, \epsilon_0]$. Fix an arbitrary $\epsilon \in (0, \epsilon_0]$ and define

$$\tau = \inf\{ t : (t, y) \in \mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap Q \}. \tag{4.5}$$

Notice that, since $x_0$ cannot be a solution to $(\mathcal{P})$ (nor even a local one) because of hypothesis (iii), then
$\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap Q \ne \emptyset$. For the same reason, it is readily seen that

$$\tau = \inf\{ t : (t, y) \in \mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap ((-\infty, 0] \times C) \}.$$

According to Lemma 2.10, set $\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon))$ can be assumed to be closed. As a closed convex set, by
Mazur's theorem it is also weakly closed and, by continuity of $\mathcal{I}_{\mathcal{P},x_0}$, bounded. Therefore, being $(Y, \|\cdot\|)$ reflexive,
$\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon))$ turns out to be weakly compact. Since $(-\infty, 0] \times C$ is weakly closed as well, again by
Mazur's theorem, then also $\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap ((-\infty, 0] \times C)$ turns out to be a weakly compact subset of $\mathbb{R} \times Y$.
The projection function $(t, y) \mapsto t$ is continuous and convex and thereby it is also lower semicontinuous with
respect to the weak topology, again as a consequence of Mazur's theorem. Thus, the infimum in (4.5)
is actually attained. In other words, there exists $(\hat{t}, \hat{y}) \in \mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap Q$, where $\hat{t} = \tau$. By definition of
$\mathcal{I}_{\mathcal{P},x_0}$, this means that there exists $\hat{x} \in B(x_0, \epsilon)$ such that

$$\hat{t} = \varphi(\hat{x}) - \varphi(x_0) < 0, \qquad \hat{y} = g(\hat{x}) \in C.$$

Therefore it is possible to set $x_\epsilon = \hat{x}$ to get the first assertion in the thesis. Indeed, assume ab absurdo the
existence of $x \in B(x_0, \epsilon) \cap R$ such that $\varphi(x) < \varphi(\hat{x})$. Then, it follows

$$\varphi(x) - \varphi(x_0) = \varphi(x) - \varphi(\hat{x}) + \varphi(\hat{x}) - \varphi(x_0) < \varphi(\hat{x}) - \varphi(x_0) = \hat{t}.$$

Consequently, it is $(\varphi(x) - \varphi(x_0), g(x)) \in \mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon)) \cap Q$, but such an inclusion clearly contradicts the
definition of $\tau$. Observe that, being $(\hat{t}, \hat{y}) \in \mathrm{bd}\,\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon))$, then according to what was noticed in Remark 3.3
(iv), it is $\hat{x} \in \mathrm{bd}\,B(x_0, \epsilon)$.

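The second part of the thesis is a straightforward consequence of the first one. Note that, by optimality of
$\hat{x}$, one has

$$\mathcal{I}_{\mathcal{P},\hat{x}}(B(x_0, \epsilon)) \cap Q = \emptyset.$$

Moreover, being

$$\mathcal{I}_{\mathcal{P},\hat{x}}(x) = \mathcal{I}_{\mathcal{P},x_0}(x) + w_{x_0,\hat{x}}, \quad \forall x \in X,$$

where $w_{x_0,\hat{x}} = (\varphi(x_0) - \varphi(\hat{x}), \mathbf{0})$, then $\mathcal{I}_{\mathcal{P},\hat{x}}(B(x_0, \epsilon))$ is a mere translation of $\mathcal{I}_{\mathcal{P},x_0}(B(x_0, \epsilon))$. Thus, set
$\mathcal{I}_{\mathcal{P},\hat{x}}(B(x_0, \epsilon))$ is a convex subset of $\mathbb{R} \times Y$ with nonempty interior (recall Remark 3.3 (iv)) and disjoint
from $Q$. According to Eidelheit's theorem (see, for example, [H]), it is then possible to linearly separate
$\mathcal{I}_{\mathcal{P},\hat{x}}(B(x_0, \epsilon))$ and $\mathrm{cl}\,Q$, which means that there exist $(\rho_\epsilon, \lambda_\epsilon) \in (\mathbb{R} \times Y^*) \setminus \{(0, \mathbf{0}^*)\}$ and $\alpha \in \mathbb{R}$ such that

$$\rho_\epsilon (\varphi(x) - \varphi(\hat{x})) + \langle \lambda_\epsilon, g(x) \rangle \ge \alpha, \quad \forall x \in B(x_0, \epsilon), \tag{4.6}$$

and

$$\rho_\epsilon t + \langle \lambda_\epsilon, y \rangle \le \alpha, \quad \forall (t, y) \in \mathrm{cl}\,Q = (-\infty, 0] \times C. \tag{4.7}$$

Taking $x = \hat{x}$ in inequality (4.6), one finds

$$\langle \lambda_\epsilon, g(\hat{x}) \rangle \ge \alpha.$$

On the other hand, being $(0, g(\hat{x})) \in \mathrm{cl}\,Q$, from inequality (4.7) one gets

$$\langle \lambda_\epsilon, g(\hat{x}) \rangle \le \alpha,$$

wherefrom one deduces

$$\langle \lambda_\epsilon, g(\hat{x}) \rangle = \alpha. \tag{4.8}$$

Taking now an arbitrary $y \in C$, then being $(0, y) \in \mathrm{cl}\,Q$, by inequality (4.7) it results in

$$\langle \lambda_\epsilon, y \rangle \le \alpha,$$

and hence

$$\langle \lambda_\epsilon, y - g(\hat{x}) \rangle \le 0, \quad \forall y \in C.$$

This shows that $\lambda_\epsilon \in N(g(\hat{x}), C)$. Again, since $(-1, g(\hat{x})) \in \mathrm{cl}\,Q$, from inequality (4.7) it follows that $\rho_\epsilon \ge 0$.
Let us show now that, under the current hypotheses, actually it is $\rho_\epsilon > 0$, so up to a rescaling of $\lambda_\epsilon$ it is
possible to take $\rho_\epsilon = 1$. Indeed, assume to the contrary that $\rho_\epsilon = 0$. Since by virtue of hypothesis (iii)
mapping $g$ is metrically regular around $(x_0, g(x_0))$, one has

$$g(B(x_0, r)) \supseteq B(g(x_0), \sigma r)$$

for positive $\sigma$ and $r < \epsilon$. From inequality (4.6) it follows

$$\langle \lambda_\epsilon, g(x_0) + \eta u \rangle \ge \alpha, \quad \forall u \in \mathbb{S},$$

with $0 < \eta < \sigma r$. Being $(0, g(x_0)) \in \mathrm{cl}\,Q$, owing to (4.7) one has

$$\langle \lambda_\epsilon, g(x_0) \rangle \le \alpha.$$

Thus, one finds

$$\eta \langle \lambda_\epsilon, u \rangle \ge \alpha - \langle \lambda_\epsilon, g(x_0) \rangle \ge 0, \quad \forall u \in \mathbb{S},$$

which cannot be consistent with the fact that $\lambda_\epsilon \ne \mathbf{0}^*$ (remember that $(\rho_\epsilon, \lambda_\epsilon) \in (\mathbb{R} \times Y^*) \setminus \{(0, \mathbf{0}^*)\}$).
Finally, by using equality (4.8) in (4.6), one obtains

$$L(\lambda_\epsilon; x) = \varphi(x) + \langle \lambda_\epsilon, g(x) \rangle \ge \varphi(\hat{x}) + \langle \lambda_\epsilon, g(\hat{x}) \rangle = L(\lambda_\epsilon; \hat{x}), \quad \forall x \in B(x_0, \epsilon).$$

This completes the proof. □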

Remark 4.3. (i) In a finite dimensional setting the existence of a solution to $(\mathcal{P}_{x_0,\epsilon})$ is automatic, as an obvious consequence of the Weierstrass theorem. In that case, indeed, set $B(x_0,\epsilon) \cap R$ is compact, $g$ being continuous near $x_0$. If $X$ is infinite dimensional the solution existence becomes a by-product of the convexity hidden in the problem localization. Observe that, since $B(x_0,\epsilon) \cap R$ is not necessarily convex, it may fail to be weakly closed. Analogously, since $\varphi$ is not convex, nothing can be said about its weak lower semicontinuity. Therefore, arguments based on weak compactness in a reflexive space cannot be invoked directly, without passing through the convexity principle.

(ii) The optimality condition expressed by (4.3) and (4.4) can be regarded as another manifestation of the convexity behaviour of $(\mathcal{P}_{x_0,\epsilon})$. The property for a solution to be minimal also for the Lagrangian function $L(\lambda_\epsilon; \cdot)$, while it is typical in convex optimization, is a circumstance generally failing in nonlinear programming.

(iii) A feature of Theorem 4.2 to be underlined is that such result guarantees the existence of regular Lagrange multipliers, i.e. multipliers with nonnull first component $p_\epsilon$.






4.2. Lagrangian duality. In its general form, a Lagrangian duality scheme can be defined whenever the following elements are given: a function $\mathcal{L} : A \times B \to \mathbb{R} \cup \{\pm\infty\}$, where $A$ and $B$ are arbitrary sets, and subsets $S_A \subseteq A$ and $S_B \subseteq B$. In considering the extremum problems

$$(\mathcal{P}_{\mathcal{L}}) \qquad \min_{b \in S_B} \sup_{a \in S_A} \mathcal{L}(a;b)$$

and

$$(\mathcal{P}^*_{\mathcal{L}}) \qquad \max_{a \in S_A} \inf_{b \in S_B} \mathcal{L}(a;b),$$

a Lagrangian duality scheme singles out two fundamental concepts: one is the duality gap, i.e. the difference of the respective optimal values of the problems $(\mathcal{P}_{\mathcal{L}})$ and $(\mathcal{P}^*_{\mathcal{L}})$

$$\min_{b \in S_B} \sup_{a \in S_A} \mathcal{L}(a;b) - \max_{a \in S_A} \inf_{b \in S_B} \mathcal{L}(a;b),$$

provided that such difference is defined (it is, if $(\mathcal{P}_{\mathcal{L}})$ and $(\mathcal{P}^*_{\mathcal{L}})$ do not happen to have the same infinite optimal value). The other one is a saddle point for $\mathcal{L}$, i.e. any element $(\bar{a},\bar{b}) \in S_A \times S_B$ satisfying the inequalities

$$\mathcal{L}(a;\bar{b}) \le \mathcal{L}(\bar{a};\bar{b}) \le \mathcal{L}(\bar{a};b), \quad \forall (a,b) \in S_A \times S_B.$$

In this context, any function such as $\mathcal{L}$ is usually called the Lagrangian function associated with the duality scheme. In such a general setting, the following well-known proposition explains the role of the aforementioned concepts (for its proof, which is elementary, see for instance [3]).

Proposition 4.4. Whenever it is defined, the duality gap is nonnegative, that is

$$\sup_{a \in S_A} \inf_{b \in S_B} \mathcal{L}(a;b) \le \inf_{b \in S_B} \sup_{a \in S_A} \mathcal{L}(a;b).$$

Moreover, the function $\mathcal{L}$ admits a saddle point iff problems $(\mathcal{P}_{\mathcal{L}})$ and $(\mathcal{P}^*_{\mathcal{L}})$ share the same optimal value and each has nonempty set of optimal solutions. In that case the set of saddle points for $\mathcal{L}$ coincides with the Cartesian product of the respective optimal solution sets.
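
The statement of Proposition 4.4 can be checked by hand on a finite duality scheme. The sketch below is only an illustration and is not taken from the paper: it assumes that $S_A$ and $S_B$ are finite index sets and that the Lagrangian is given as a table of numbers; the names `primal_value`, `dual_value`, `saddle_points` and the two tables are hypothetical.

```python
# Illustrative finite duality scheme (an assumption, not the paper's setting):
# rows are indexed by a in S_A, columns by b in S_B, table[a][b] plays the
# role of the Lagrangian value at (a; b).

def primal_value(table):
    """Value of the primal problem: min over b of sup over a of table[a][b]."""
    n_b = len(table[0])
    return min(max(row[b] for row in table) for b in range(n_b))

def dual_value(table):
    """Value of the dual problem: max over a of inf over b of table[a][b]."""
    return max(min(row) for row in table)

def saddle_points(table):
    """Pairs (a, b) whose entry is maximal in its column and minimal in its row."""
    points = []
    for a, row in enumerate(table):
        for b, value in enumerate(row):
            is_max_in_column = all(other[b] <= value for other in table)
            is_min_in_row = all(value <= entry for entry in row)
            if is_max_in_column and is_min_in_row:
                points.append((a, b))
    return points

# A table admitting a saddle point: the duality gap vanishes.
with_saddle = [[1, 2, 3],
               [4, 5, 6],
               [0, 7, 1]]

# A table without saddle points: the duality gap is strictly positive.
without_saddle = [[0, 1],
                  [1, 0]]

gap_with_saddle = primal_value(with_saddle) - dual_value(with_saddle)
gap_without_saddle = primal_value(without_saddle) - dual_value(without_saddle)
```

In the first table the only saddle point is the pair formed by the unique dual solution and the unique primal solution, in agreement with the last assertion of the proposition; in the second table the gap equals 1 and no saddle point exists.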

In view of the above result, it becomes crucial to find out verifiable conditions on problem data, under 
which a saddle point exists. 

Now, as a consequence of Theorem 4.2 it turns out that a Lagrangian duality scheme, through a localization of problem $(\mathcal{P})$, can be performed by making use of the simplest type of Lagrangian function, namely the linear one, provided that $C$ is a cone. Thus, in such event, the extremum problems in duality are

$$(\mathcal{P}_L) \qquad \min_{x \in B(x_0,\epsilon)} \sup_{y^* \in C^{\ominus}} L(y^*;x)$$

and

$$(\mathcal{P}^*_L) \qquad \max_{y^* \in C^{\ominus}} \inf_{x \in B(x_0,\epsilon)} L(y^*;x).$$

Theorem 4.5. With reference to problem $(\mathcal{P})$, suppose that $C$ is a nonempty closed convex cone and $x_0 \in R$. Under the hypotheses of Theorem 4.2 there exists $\epsilon_0 > 0$ such that for every $\epsilon \in (0,\epsilon_0]$ there are $(x_\epsilon,\lambda_\epsilon) \in \mathrm{bd}\,B(x_0,\epsilon) \times (C^{\ominus} \cap g(x_\epsilon)^{\perp})$ such that $(x_\epsilon,\lambda_\epsilon)$ is a saddle point for $L$. Consequently, the related duality gap is $0$ and both the primal and the dual problem have nonempty solution sets.

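Proof. It is readily seen that if $C$ is a cone and $g(x_\epsilon) \in C$, the inclusion $\lambda_\epsilon \in N(g(x_\epsilon),C)$ implies $\lambda_\epsilon \in C^{\ominus} \cap g(x_\epsilon)^{\perp}$. Indeed, taking $y = 2g(x_\epsilon)$ and $y = 0$ in the inequality

$$\langle \lambda_\epsilon, y - g(x_\epsilon) \rangle \le 0,$$

one obtains two inequalities, which can be consistent only if $\lambda_\epsilon \in g(x_\epsilon)^{\perp}$. Taking this fact into account, the last inequality gives also $\lambda_\epsilon \in C^{\ominus}$. Take $\epsilon \in (0,\epsilon_0]$, where $\epsilon_0$ is as in Theorem 4.2. By applying inequality (4.4), one finds

$$L(\lambda;x_\epsilon) = \varphi(x_\epsilon) + \langle \lambda, g(x_\epsilon) \rangle \le \varphi(x_\epsilon) = L(\lambda_\epsilon;x_\epsilon) \le L(\lambda_\epsilon;x), \quad \forall (\lambda,x) \in C^{\ominus} \times B(x_0,\epsilon).$$

The last assertion in the thesis immediately follows from Proposition 4.4. □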
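
The first step of the proof, namely that a normal vector to a cone is automatically orthogonal to the base point, can be tried out numerically. The sketch below rests on assumptions that are not in the paper: the cone is taken to be the nonnegative orthant of the plane, the normal cone inequality is tested only on a finite sample of the cone (enlarged with the two test points $y = 0$ and $y = 2\bar{y}$ used in the proof), and the names `inner`, `sample_cone` and `is_normal_on_sample` are hypothetical.

```python
from itertools import product

def inner(u, v):
    """Euclidean pairing <u, v>, standing in for the duality pairing."""
    return sum(a * b for a, b in zip(u, v))

def sample_cone(dim, levels=(0.0, 0.5, 1.0, 2.0, 4.0)):
    """Finite sample of the nonnegative orthant (an assumed choice of the cone C)."""
    return [tuple(point) for point in product(levels, repeat=dim)]

def is_normal_on_sample(lam, y_bar, sample):
    """Test <lam, y - y_bar> <= 0 on the sample and on y = 0, y = 2 * y_bar."""
    origin = tuple(0.0 for _ in y_bar)
    doubled = tuple(2.0 * c for c in y_bar)
    test_points = list(sample) + [origin, doubled]
    return all(
        inner(lam, [yi - bi for yi, bi in zip(y, y_bar)]) <= 1e-12
        for y in test_points
    )

sample = sample_cone(2)
y_bar = (1.0, 0.0)            # a point of the cone, playing the role of g(x_eps)

lam_normal = (0.0, -3.0)      # passes the test and is orthogonal to y_bar
lam_polar_only = (-1.0, 0.0)  # nonpositive on the cone, but not orthogonal to y_bar

normal_ok = is_normal_on_sample(lam_normal, y_bar, sample)
polar_only_ok = is_normal_on_sample(lam_polar_only, y_bar, sample)
```

The second vector is nonpositive on the whole orthant, yet it fails the test at $y = 0$, which is exactly the point where the proof extracts the orthogonality condition.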

4.3. Problem calmness. This subsection focuses on some properties of constrained extremum problems in the presence of perturbations. The perturbation analysis of optimization problems has proved able to afford useful theoretical insights into the very nature of the issue. The format of parametric problems here in consideration is as follows

$$(\mathcal{P}_y) \qquad \min \varphi(x) \quad \text{subject to} \quad g(x) + y \in C,$$

where $y \in Y$ plays the role of a parameter. The corresponding feasible region is given therefore by $R(y) = g^{-1}(C - y)$. A notion capturing a sensitivity behaviour with respect to perturbations near a reference value is that of problem calmness. Proposed by R.T. Rockafellar, such notion appeared firstly in [4] and since then it has been largely employed in perturbation analysis of optimization problems and related fields.

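Definition 4.6. With reference to a class of problems $(\mathcal{P}_y)$, let $\bar{x} \in R(0)$ be a solution to $(\mathcal{P}_0)$. Problem $(\mathcal{P}_0)$ is said to be calm at $\bar{x}$ if there exists a constant $r > 0$ such that

$$\inf_{y \in r\mathbb{B} \setminus \{0\}} \ \inf_{x \in R(y) \cap B(\bar{x},r)} \frac{\varphi(x) - \varphi(\bar{x})}{\|y\|} > -\infty.$$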

Following a successful approach to this topic, sufficient conditions for problem calmness can be achieved by studying the localized (optimal) value function associated with $(\mathcal{P}_y)$, i.e. function $\mathrm{val}_{x_0,\epsilon} : Y \to \mathbb{R} \cup \{\pm\infty\}$ defined by

$$\mathrm{val}_{x_0,\epsilon}(y) = \inf_{x \in R(y) \cap B(x_0,\epsilon)} \varphi(x).$$

In particular, the property of $\mathrm{val}_{x_0,\epsilon}$ to be calm from below at $0$ appeared to be adequate to this aim. Recall that a function $\phi : Y \to \mathbb{R} \cup \{\pm\infty\}$ is said to be calm from below at $y_0$ if $y_0 \in \mathrm{dom}\,\phi$ and it holds

$$\liminf_{y \to y_0} \frac{\phi(y) - \phi(y_0)}{\|y - y_0\|} > -\infty.$$

In turn, calmness from below for function $\phi$ can be easily obtained from the subdifferentiability property.
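
To make the definition of calmness from below concrete, the following sketch compares the difference quotients of two scalar functions at $0$. Both functions are hypothetical illustrations chosen here, not value functions of any problem considered in the paper: `phi_calm` admits the subgradient $-1$ at $0$, while `phi_not_calm` behaves like $-\sqrt{|y|}$ and has no subgradient there.

```python
import math

def lower_quotients(phi, y0, steps):
    """For each step h, the smaller of the two quotients (phi(y0 +/- h) - phi(y0)) / h."""
    return [
        min((phi(y0 + sign * h) - phi(y0)) / h for sign in (1.0, -1.0))
        for h in steps
    ]

def phi_calm(y):
    """Satisfies phi(y) - phi(0) >= -y, so -1 is a subgradient at 0."""
    return abs(y) - 2.0 * y

def phi_not_calm(y):
    """Quotients at 0 equal -1 / sqrt(|y|), which is unbounded from below."""
    return -math.sqrt(abs(y))

# Steps shrinking towards 0: h = 10^-1, ..., 10^-8.
steps = [10.0 ** (-k) for k in range(1, 9)]

calm_quotients = lower_quotients(phi_calm, 0.0, steps)
wild_quotients = lower_quotients(phi_not_calm, 0.0, steps)
```

For the first function the quotients stay at the level of the subgradient, so the lower limit is finite; for the second one they decrease without bound as the step shrinks, so calmness from below fails at $0$.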

Theorem 4.7. Given a class of perturbed problems $(\mathcal{P}_y)$, let $x_0 \in R(0) \cap \Omega$, where $\Omega$ is an open subset of $X$. Suppose that:

(i) $(X, \|\cdot\|)$ is uniformly convex and its convexity modulus fulfils condition (2.1);

(ii) $(Y, \|\cdot\|)$ is a reflexive Banach space;

(iii) $\varphi, g \in C^{1,1}(\Omega)$ and $\mathrm{D}(\varphi,g)(x_0) \in \mathcal{L}(X, \mathbb{R} \times Y)$ is onto.

Then, there exists a positive $\epsilon_0$ such that for every $\epsilon \in (0,\epsilon_0]$ it holds

$$\partial\,\mathrm{val}_{x_0,\epsilon}(0) \ne \varnothing.$$

Consequently, function $\mathrm{val}_{x_0,\epsilon}$ is calm from below at $0$.

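Proof. The subdifferentiability of $\mathrm{val}_{x_0,\epsilon}$ at $0$ can be achieved as a further consequence of the possibility of separating $\mathcal{I}_{\varphi,x_0}(B(x_0,\epsilon))$ and $\mathrm{cl}\,Q$ for every $\epsilon \in (0,\epsilon_0]$, where $\epsilon_0$ is a positive constant as in Theorem 4.2. Indeed, fix an arbitrary $y \in Y$. By applying Theorem 4.2 to $(\mathcal{P}_0)$, one gets $x_\epsilon \in \mathrm{bd}\,B(x_0,\epsilon)$ and $\lambda_\epsilon \in Y^*$ satisfying (4.3), (|Oj) and (4.4). It follows

$$\langle \lambda_\epsilon, g(x_\epsilon) \rangle + \varphi(x_\epsilon) \le \varphi(x) + \langle \lambda_\epsilon, g(x) \rangle, \quad \forall x \in B(x_0,\epsilon). \tag{4.9}$$

Because of (4.3), whenever $x \in R(y) \cap B(x_0,\epsilon)$, being $g(x) + y \in C$ one has

$$\langle \lambda_\epsilon, g(x) - g(x_\epsilon) \rangle \le -\langle \lambda_\epsilon, y \rangle.$$

The last inequality on account of (4.9) gives

$$0 \le \varphi(x) - \varphi(x_\epsilon) + \langle \lambda_\epsilon, g(x) - g(x_\epsilon) \rangle \le \varphi(x) - \varphi(x_\epsilon) - \langle \lambda_\epsilon, y \rangle, \quad \forall x \in R(y) \cap B(x_0,\epsilon),$$

whence

$$\langle \lambda_\epsilon, y \rangle \le \inf_{x \in R(y) \cap B(x_0,\epsilon)} \varphi(x) - \varphi(x_\epsilon) = \mathrm{val}_{x_0,\epsilon}(y) - \mathrm{val}_{x_0,\epsilon}(0).$$

By arbitrariness of $y \in Y$ the first assertion in the thesis is proved. The second one is a straightforward consequence of the first one. Indeed, obviously $0 \in \mathrm{dom}\,\mathrm{val}_{x_0,\epsilon}$ and it holds

$$\liminf_{y \to 0} \frac{\mathrm{val}_{x_0,\epsilon}(y) - \mathrm{val}_{x_0,\epsilon}(0)}{\|y\|} \ge \inf_{u \in \mathbb{S}} \langle \lambda_\epsilon, u \rangle \ge -\|\lambda_\epsilon\| > -\infty.$$

This completes the proof. □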

Corollary 4.8. Under the hypotheses of Theorem 4.7 there exists $\epsilon_0 > 0$ such that, for every $\epsilon \in (0,\epsilon_0]$, $(\mathcal{P}_0)$ admits a corresponding $\epsilon$-localization, which is calm at a respective solution $x_\epsilon \in B(x_0,\epsilon)$.

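Proof. According to Theorem 4.2 an $\epsilon_0 > 0$ exists such that, for every $\epsilon \in (0,\epsilon_0]$, each $\epsilon$-localization of $(\mathcal{P}_0)$ admits a solution $x_\epsilon \in \mathrm{bd}\,B(x_0,\epsilon)$. Thus, fixed $r > 0$, using the calmness from below of function $\mathrm{val}_{x_0,\epsilon}$ at $0$, as it holds by definition

$$\varphi(x) \ge \mathrm{val}_{x_0,\epsilon}(y), \quad \forall x \in R(y) \cap B(x_0,\epsilon),$$

one obtains

$$\inf_{y \in r\mathbb{B} \setminus \{0\}} \ \inf_{x \in R(y) \cap B(x_0,\epsilon) \cap B(x_\epsilon,r)} \frac{\varphi(x) - \varphi(x_\epsilon)}{\|y\|} \ge \inf_{y \in r\mathbb{B} \setminus \{0\}} \frac{\mathrm{val}_{x_0,\epsilon}(y) - \mathrm{val}_{x_0,\epsilon}(0)}{\|y\|} > -\infty.$$

The proof is complete. □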